Kernel Ridge Regression (KRR) is a simple yet powerful technique for non-parametric regression whose computation amounts to solving a linear system. This system is usually dense and highly ill-conditioned. In addition, the dimensions of the matrix are the same as the number of data points, so direct methods are unrealistic for large-scale datasets. In this paper, we propose a preconditioning technique for accelerating the solution of the aforementioned linear system. The preconditioner is based on random feature maps, such as random Fourier features, which have recently emerged as a powerful technique for speeding up and scaling the training of kernel-based methods, such as kernel ridge regression, by resorting to approximations. However, random feature maps only provide crude approximations to the kernel function, so delivering state-of-the-art results by directly solving the approximated system requires the number of random features to be very large. We show that random feature maps can be much more effective in forming preconditioners, since under certain conditions a not-too-large number of random features is sufficient to yield an effective preconditioner. We empirically evaluate our method and show it is highly effective for datasets of up to one million training examples.
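To make the idea concrete, the following is a minimal sketch of the approach the abstract describes: the KRR system (K + λI)α = y is solved with preconditioned conjugate gradients, where the preconditioner is built from a random Fourier feature map Z (so K ≈ ZZᵀ) and applied cheaply via the Woodbury identity. All problem sizes, the Gaussian-kernel choice, and the parameter values here are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np
from scipy.sparse.linalg import cg, LinearOperator

rng = np.random.default_rng(0)
n, d, s, lam, gamma = 500, 5, 100, 1e-2, 0.5  # illustrative sizes/parameters

X = rng.standard_normal((n, d))
y = np.sin(X.sum(axis=1))

# Exact Gaussian kernel matrix: K_ij = exp(-gamma * ||x_i - x_j||^2).
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-gamma * sq)
A = K + lam * np.eye(n)  # the dense, ill-conditioned KRR system matrix

# Random Fourier features for the Gaussian kernel: with w ~ N(0, 2*gamma*I)
# and z(x) = sqrt(2/s) * cos(x @ W + b), we have K ≈ Z @ Z.T.
W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, s))
b = rng.uniform(0, 2 * np.pi, size=s)
Z = np.sqrt(2.0 / s) * np.cos(X @ W + b)

# Preconditioner P = Z Z^T + lam*I, inverted via the Woodbury identity:
#   P^{-1} v = (v - Z (lam*I_s + Z^T Z)^{-1} Z^T v) / lam
# Only an s x s system is factored, so each application costs O(n*s).
C = np.linalg.cholesky(lam * np.eye(s) + Z.T @ Z)

def apply_Pinv(v):
    t = np.linalg.solve(C.T, np.linalg.solve(C, Z.T @ v))
    return (v - Z @ t) / lam

M = LinearOperator((n, n), matvec=apply_Pinv)
alpha, info = cg(A, y, M=M, maxiter=500)  # preconditioned CG on the exact system
```

Note that, unlike directly solving the random-feature system Zᵀ(ZZᵀ + λI)⁻¹... as an approximation to KRR, here the random features only shape the preconditioner; CG still iterates with the exact kernel matrix, so the solution is that of the original KRR problem.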